Segmental speech coding model for storage applications

Authors

Anssi Rämö

Jani Nurminen

Sakari Himanen

Ari Heikkinen

Abstract

This paper introduces a novel speech coder structure for storage applications operating at low bit rates. The coder exploits the inherent segmental nature of speech signals by dividing the input into segments of variable length. Quite often, the length of a segment matches the length of a phoneme. The individual segments are coded using adaptive techniques that take into account the relative perceptual importance of different types of speech, e.g., voiced and unvoiced speech. These main features of the proposed approach are possible because many of the design constraints of real-time conversational speech can be relaxed in storage applications. A practical implementation containing the speech-adaptive segmentation is described, and its performance is verified in a listening test at average bit rates of about 1.0 kbps and 2.4 kbps, respectively. The results show that the segmental model significantly improves the coding efficiency.

Similar resources

Segmental feature extraction and coding for speech synthesis

This paper describes a segmental feature extraction and speech coding method in an acoustic-articulatory domain using nomograms that represent a mapping between formant frequencies and articulatory parameters. The vocal tract model is a modified Fant model, in which we newly introduced a parameter for successively adjusting vocal tract lengths. We investigated first the relationship between form...

Segmental Feature Extraction and Coding for Speech Synthesis

Model-Based Speech Signal Coding Using Optimized Temporal Decomposition for Storage and Broadcasting Applications

A dynamic programming-based optimization strategy for a temporal decomposition (TD) model of speech and its application to low-rate speech coding in storage and broadcasting is presented. In previous work with the spectral stability-based event localizing (SBEL) TD algorithm, the event localization was performed based on a spectral stability criterion. Although this approach gave reasonably goo...

Speech coding using mixture of gaussians polynomial model

We have investigated a novel method of spectral estimation based on mixture of Gaussians in a sinusoidal analysis and synthesis framework. After quantisation of this parametric scheme, a fixed frame-rate coder operating at a bit-rate of around 2.4 kbits/s has been developed. This paper describes an extension to this spectral model based on constraining the parameters of the mixture of Gaussians to...

LPC Quantization and Interpolation in Coding for Speech Storage Applications

In this paper, personalized quantization of the filter coefficients of Linear Predictive Coding (LPC) is studied. The study covers two aspects. On the one hand, a signal-adaptive algorithm which determines when to transmit a set of LPC coefficients is introduced. This algorithm allows a reduction of about 35% of the bit rate needed to code the LPC coefficients in speech storage applications – e...

Publication date: 2004
